Overview

Dataset statistics

Number of variables: 16
Number of observations: 10639
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.4 MiB
Average record size in memory: 136.0 B
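The two memory figures can be cross-checked against each other: the average record size times the number of observations should give the total size. A minimal sketch, using only the numbers reported above (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
n_observations = 10639
avg_record_bytes = 136.0

total_bytes = n_observations * avg_record_bytes
total_mib = total_bytes / 2**20  # 1 MiB = 1,048,576 bytes

print(f"{total_bytes:.0f} B = {total_mib:.1f} MiB")
```

Rounded to one decimal place, the result agrees with the reported 1.4 MiB.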

Variable types

DateTime: 2
Numeric: 9
Categorical: 5

Alerts

mta_tax has constant value "0.5" (Constant)
improvement_surcharge has constant value "0.3" (Constant)
RatecodeID is highly overall correlated with fare_amount and 1 other field (High correlation)
duration_minutes is highly overall correlated with fare_amount and 2 other fields (High correlation)
fare_amount is highly overall correlated with RatecodeID and 3 other fields (High correlation)
tip_amount is highly overall correlated with total_amount (High correlation)
total_amount is highly overall correlated with duration_minutes and 3 other fields (High correlation)
trip_distance is highly overall correlated with RatecodeID and 3 other fields (High correlation)
RatecodeID is highly imbalanced (99.6%) (Imbalance)
payment_type is highly imbalanced (52.7%) (Imbalance)
duration_minutes is highly skewed (γ1 = 20.6986304) (Skewed)
tip_amount has 3598 (33.8%) zeros (Zeros)
tolls_amount has 10327 (97.1%) zeros (Zeros)
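The report does not say how the imbalance percentages are computed. One formula that reproduces both reported values is one minus the normalized Shannon entropy of the class counts. The sketch below assumes that formula and takes the counts from the Common Values tables of RatecodeID and payment_type later in this report; `imbalance_score` is a hypothetical helper, not the profiler's API.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class shares and k is the number of classes (assumed formula)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen here for a single class
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

ratecode_counts = [10636, 3]            # RatecodeID values 1 and 4
payment_counts = [7340, 3227, 56, 16]   # payment_type values 1 to 4

print(f"RatecodeID:   {imbalance_score(ratecode_counts):.1%}")
print(f"payment_type: {imbalance_score(payment_counts):.1%}")
```

Under this assumption the scores come out at about 99.6% and 52.7%, matching the two Imbalance alerts.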

Reproduction

Analysis started: 2025-12-09 11:34:57.222316
Analysis finished: 2025-12-09 11:35:09.277819
Duration: 12.06 seconds
Software version: ydata-profiling v4.18.0
Download configuration: config.json

Variables

tpep_pickup_datetime

Distinct: 10635
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 166.2 KiB
Minimum: 2017-01-01 00:08:25
Maximum: 2017-12-31 23:45:30
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

tpep_dropoff_datetime

Distinct: 10634
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 166.2 KiB
Minimum: 2017-01-01 00:17:20
Maximum: 2017-12-31 23:49:24
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

passenger_count
Real number (ℝ)

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.636244
Minimum: 0
Maximum: 6
Zeros: 9
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 1
Q3: 2
95-th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.2574375
Coefficient of variation (CV): 0.76849022
Kurtosis: 3.8172941
Mean: 1.636244
Median Absolute Deviation (MAD): 0
Skewness: 2.1823746
Sum: 17408
Variance: 1.5811491
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common values

Value  Count  Frequency (%)
1      7491   70.4%
2      1644   15.5%
5      538    5.1%
3      455    4.3%
6      283    2.7%
4      219    2.1%
0      9      0.1%

Minimum values

Value  Count  Frequency (%)
0      9      0.1%
1      7491   70.4%
2      1644   15.5%
3      455    4.3%
4      219    2.1%
5      538    5.1%
6      283    2.7%

Maximum values

Value  Count  Frequency (%)
6      283    2.7%
5      538    5.1%
4      219    2.1%
3      455    4.3%
2      1644   15.5%
1      7491   70.4%
0      9      0.1%
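Because passenger_count takes only seven distinct values, its descriptive statistics can be recomputed from the frequency table alone. The sketch below does this under the assumption that the reported variance uses the sample (n - 1) denominator; the variable names are illustrative.

```python
import math

# (value: count) pairs from the passenger_count frequency table
freq = {0: 9, 1: 7491, 2: 1644, 3: 455, 4: 219, 5: 538, 6: 283}

n = sum(freq.values())                       # number of observations
total = sum(v * c for v, c in freq.items())  # the "Sum" statistic
mean = total / n

# Sample variance (n - 1 denominator); assumed, but it matches the report.
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total, round(mean, 6), round(variance, 4), round(cv, 4))
```

The counts add up to the 10639 observations, and the recomputed sum, mean, variance, and CV agree with the values listed under Descriptive statistics.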

trip_distance
Real number (ℝ)

High correlation 

Distinct: 1060
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.7145202
Minimum: 0
Maximum: 30.83
Zeros: 44
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.5
Q1: 1
median: 1.72
Q3: 3.2
95-th percentile: 8.96
Maximum: 30.83
Range: 30.83
Interquartile range (IQR): 2.2

Descriptive statistics

Standard deviation: 2.8447256
Coefficient of variation (CV): 1.0479663
Kurtosis: 10.93103
Mean: 2.7145202
Median Absolute Deviation (MAD): 0.88
Skewness: 2.7688079
Sum: 28879.78
Variance: 8.0924636
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
1                    252    2.4%
1.1                  235    2.2%
0.8                  220    2.1%
0.9                  215    2.0%
1.2                  209    2.0%
0.7                  203    1.9%
1.3                  185    1.7%
1.4                  183    1.7%
0.6                  175    1.6%
1.5                  172    1.6%
Other values (1050)  8590   80.7%

Minimum values

Value  Count  Frequency (%)
0      44     0.4%
0.01   3      < 0.1%
0.02   4      < 0.1%
0.03   2      < 0.1%
0.04   2      < 0.1%
0.06   1      < 0.1%
0.07   2      < 0.1%
0.08   1      < 0.1%
0.1    15     0.1%
0.11   1      < 0.1%

Maximum values

Value  Count  Frequency (%)
30.83  1      < 0.1%
27.88  1      < 0.1%
27.34  1      < 0.1%
26.54  1      < 0.1%
25.86  1      < 0.1%
25.8   1      < 0.1%
24.89  1      < 0.1%
24.61  1      < 0.1%
24.1   1      < 0.1%
23.67  1      < 0.1%

RatecodeID
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 602.6 KiB

Value  Count
1      10636
4      3

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 10639
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      10636  > 99.9%
4      3      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1      10636  > 99.9%
4      3      < 0.1%

Most occurring characters

Value  Count  Frequency (%)
1      10636  > 99.9%
4      3      < 0.1%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  10639  100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
1      10636  > 99.9%
4      3      < 0.1%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  10639  100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
1      10636  > 99.9%
4      3      < 0.1%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  10639  100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
1      10636  > 99.9%
4      3      < 0.1%

PULocationID
Real number (ℝ)

Distinct: 123
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 160.64677
Minimum: 4
Maximum: 265
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.2 KiB

Quantile statistics

Minimum: 4
5-th percentile: 48
Q1: 113
median: 161
Q3: 231
95-th percentile: 249
Maximum: 265
Range: 261
Interquartile range (IQR): 118

Descriptive statistics

Standard deviation: 66.118582
Coefficient of variation (CV): 0.41157741
Kurtosis: -0.95289156
Mean: 160.64677
Median Absolute Deviation (MAD): 68
Skewness: -0.19928603
Sum: 1709121
Variance: 4371.6669
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
230                 410    3.9%
48                  410    3.9%
234                 405    3.8%
79                  401    3.8%
162                 396    3.7%
161                 392    3.7%
237                 365    3.4%
186                 349    3.3%
170                 335    3.1%
163                 313    2.9%
Other values (113)  6863   64.5%

Minimum values

Value  Count  Frequency (%)
4      30     0.3%
7      17     0.2%
12     2      < 0.1%
13     74     0.7%
14     2      < 0.1%
17     5      < 0.1%
24     22     0.2%
25     16     0.2%
28     1      < 0.1%
29     1      < 0.1%

Maximum values

Value  Count  Frequency (%)
265    2      < 0.1%
264    175    1.6%
263    166    1.6%
262    64     0.6%
261    51     0.5%
260    8      0.1%
256    9      0.1%
255    23     0.2%
249    311    2.9%
247    1      < 0.1%

DOLocationID
Real number (ℝ)

Distinct: 194
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 158.11495
Minimum: 4
Maximum: 265
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.2 KiB

Quantile statistics

Minimum: 4
5-th percentile: 41
Q1: 100
median: 161
Q3: 233
95-th percentile: 261
Maximum: 265
Range: 261
Interquartile range (IQR): 133

Descriptive statistics

Standard deviation: 72.645088
Coefficient of variation (CV): 0.45944476
Kurtosis: -1.0902426
Mean: 158.11495
Median Absolute Deviation (MAD): 70
Skewness: -0.25363414
Sum: 1682185
Variance: 5277.3087
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
48                  357    3.4%
236                 319    3.0%
170                 319    3.0%
230                 306    2.9%
79                  306    2.9%
186                 290    2.7%
239                 279    2.6%
142                 276    2.6%
141                 262    2.5%
237                 249    2.3%
Other values (184)  7676   72.1%

Minimum values

Value  Count  Frequency (%)
4      63     0.6%
7      62     0.6%
9      2      < 0.1%
10     5      < 0.1%
11     1      < 0.1%
12     6      0.1%
13     86     0.8%
14     12     0.1%
15     3      < 0.1%
16     2      < 0.1%

Maximum values

Value  Count  Frequency (%)
265    12     0.1%
264    145    1.4%
263    211    2.0%
262    143    1.3%
261    38     0.4%
260    15     0.1%
259    3      < 0.1%
257    16     0.2%
256    42     0.4%
255    66     0.6%

payment_type
Categorical

Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 602.6 KiB

Value  Count
1      7340
2      3227
3      56
4      16

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 10639
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      7340   69.0%
2      3227   30.3%
3      56     0.5%
4      16     0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1      7340   69.0%
2      3227   30.3%
3      56     0.5%
4      16     0.2%

Most occurring characters

Value  Count  Frequency (%)
1      7340   69.0%
2      3227   30.3%
3      56     0.5%
4      16     0.2%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  10639  100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
1      7340   69.0%
2      3227   30.3%
3      56     0.5%
4      16     0.2%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  10639  100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
1      7340   69.0%
2      3227   30.3%
3      56     0.5%
4      16     0.2%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  10639  100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
1      7340   69.0%
2      3227   30.3%
3      56     0.5%
4      16     0.2%

fare_amount
Real number (ℝ)

High correlation 

Distinct: 128
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.923912
Minimum: 2.5
Maximum: 85.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.2 KiB

Quantile statistics

Minimum: 2.5
5-th percentile: 4.5
Q1: 6.5
median: 9.5
Q3: 14.25
95-th percentile: 29.5
Maximum: 85.5
Range: 83
Interquartile range (IQR): 7.75

Descriptive statistics

Standard deviation: 8.3527054
Coefficient of variation (CV): 0.70050042
Kurtosis: 7.4730187
Mean: 11.923912
Median Absolute Deviation (MAD): 3.5
Skewness: 2.2713603
Sum: 126858.5
Variance: 69.767687
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
6                   555    5.2%
6.5                 513    4.8%
5.5                 507    4.8%
7                   506    4.8%
7.5                 487    4.6%
5                   470    4.4%
8.5                 456    4.3%
8                   436    4.1%
9                   435    4.1%
9.5                 409    3.8%
Other values (118)  5865   55.1%

Minimum values

Value  Count  Frequency (%)
2.5    58     0.5%
3      47     0.4%
3.5    147    1.4%
4      268    2.5%
4.5    357    3.4%
5      470    4.4%
5.5    507    4.8%
6      555    5.2%
6.5    513    4.8%
7      506    4.8%

Maximum values

Value  Count  Frequency (%)
85.5   1      < 0.1%
80     1      < 0.1%
78     1      < 0.1%
76     1      < 0.1%
73     1      < 0.1%
72.5   1      < 0.1%
70.5   1      < 0.1%
67.5   1      < 0.1%
66     2      < 0.1%
64.5   1      < 0.1%

extra
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 623.4 KiB

Value  Count
0.5    7086
1.0    3553

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 31917
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.5
2nd row: 0.5
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value  Count  Frequency (%)
0.5    7086   66.6%
1.0    3553   33.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0.5    7086   66.6%
1.0    3553   33.4%

Most occurring characters

Value  Count  Frequency (%)
0      10639  33.3%
.      10639  33.3%
5      7086   22.2%
1      3553   11.1%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  31917  100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
0      10639  33.3%
.      10639  33.3%
5      7086   22.2%
1      3553   11.1%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  31917  100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
0      10639  33.3%
.      10639  33.3%
5      7086   22.2%
1      3553   11.1%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  31917  100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
0      10639  33.3%
.      10639  33.3%
5      7086   22.2%
1      3553   11.1%

mta_tax
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 623.4 KiB

Value  Count
0.5    10639

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 31917
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.5
2nd row: 0.5
3rd row: 0.5
4th row: 0.5
5th row: 0.5

Common Values

Value  Count  Frequency (%)
0.5    10639  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0.5    10639  100.0%

Most occurring characters

Value  Count  Frequency (%)
0      10639  33.3%
.      10639  33.3%
5      10639  33.3%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  31917  100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
0      10639  33.3%
.      10639  33.3%
5      10639  33.3%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  31917  100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
0      10639  33.3%
.      10639  33.3%
5      10639  33.3%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  31917  100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
0      10639  33.3%
.      10639  33.3%
5      10639  33.3%

tip_amount
Real number (ℝ)

High correlation  Zeros 

Distinct: 529
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7493167
Minimum: 0
Maximum: 28
Zeros: 3598
Zeros (%): 33.8%
Negative: 0
Negative (%): 0.0%
Memory size: 166.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1.46
Q3: 2.46
95-th percentile: 5.46
Maximum: 28
Range: 28
Interquartile range (IQR): 2.46

Descriptive statistics

Standard deviation: 2.0059724
Coefficient of variation (CV): 1.1467177
Kurtosis: 10.398365
Mean: 1.7493167
Median Absolute Deviation (MAD): 1.46
Skewness: 2.2919282
Sum: 18610.98
Variance: 4.0239251
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   3598   33.8%
1                   708    6.7%
2                   379    3.6%
1.5                 164    1.5%
1.66                115    1.1%
3                   114    1.1%
1.96                105    1.0%
2.06                104    1.0%
1.45                102    1.0%
1.46                102    1.0%
Other values (519)  5148   48.4%

Minimum values

Value  Count  Frequency (%)
0      3598   33.8%
0.01   6      0.1%
0.02   1      < 0.1%
0.04   1      < 0.1%
0.08   1      < 0.1%
0.1    4      < 0.1%
0.12   1      < 0.1%
0.2    5      < 0.1%
0.26   2      < 0.1%
0.34   1      < 0.1%

Maximum values

Value  Count  Frequency (%)
28     1      < 0.1%
25     1      < 0.1%
18.56  1      < 0.1%
15.95  1      < 0.1%
15.32  1      < 0.1%
15     2      < 0.1%
14.86  1      < 0.1%
14.84  1      < 0.1%
14.76  1      < 0.1%
14.46  1      < 0.1%

tolls_amount
Real number (ℝ)

Zeros 

Distinct: 15
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.16687001
Minimum: 0
Maximum: 17.28
Zeros: 10327
Zeros (%): 97.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 17.28
Range: 17.28
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.98181906
Coefficient of variation (CV): 5.883736
Kurtosis: 44.538191
Mean: 0.16687001
Median Absolute Deviation (MAD): 0
Skewness: 6.234667
Sum: 1775.33
Variance: 0.96396868
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)

Common values

Value             Count  Frequency (%)
0                 10327  97.1%
5.76              220    2.1%
5.54              71     0.7%
2.64              6      0.1%
2.54              5      < 0.1%
11.52             1      < 0.1%
2.16              1      < 0.1%
8.5               1      < 0.1%
17.28             1      < 0.1%
5.49              1      < 0.1%
Other values (5)  5      < 0.1%

Minimum values

Value  Count  Frequency (%)
0      10327  97.1%
2.16   1      < 0.1%
2.54   5      < 0.1%
2.64   6      0.1%
2.7    1      < 0.1%
5.16   1      < 0.1%
5.49   1      < 0.1%
5.54   71     0.7%
5.76   220    2.1%
6.32   1      < 0.1%

Maximum values

Value  Count  Frequency (%)
17.28  1      < 0.1%
16.62  1      < 0.1%
11.52  1      < 0.1%
10.5   1      < 0.1%
8.5    1      < 0.1%
6.32   1      < 0.1%
5.76   220    2.1%
5.54   71     0.7%
5.49   1      < 0.1%
5.16   1      < 0.1%

improvement_surcharge
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 623.4 KiB

Value  Count
0.3    10639

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 31917
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.3
2nd row: 0.3
3rd row: 0.3
4th row: 0.3
5th row: 0.3

Common Values

Value  Count  Frequency (%)
0.3    10639  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0.3    10639  100.0%

Most occurring characters

Value  Count  Frequency (%)
0      10639  33.3%
.      10639  33.3%
3      10639  33.3%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  31917  100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
0      10639  33.3%
.      10639  33.3%
3      10639  33.3%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  31917  100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
0      10639  33.3%
.      10639  33.3%
3      10639  33.3%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  31917  100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
0      10639  33.3%
.      10639  33.3%
3      10639  33.3%

total_amount
Real number (ℝ)

High correlation 

Distinct: 880
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.308827
Minimum: 3.8
Maximum: 111.38
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.2 KiB

Quantile statistics

Minimum: 3.8
5-th percentile: 6.3
Q1: 8.8
median: 12.3
Q3: 17.8
95-th percentile: 37.079
Maximum: 111.38
Range: 107.58
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 10.066952
Coefficient of variation (CV): 0.65759137
Kurtosis: 7.7735122
Mean: 15.308827
Median Absolute Deviation (MAD): 4
Skewness: 2.3591718
Sum: 162870.61
Variance: 101.34353
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
7.3                 257    2.4%
8.3                 235    2.2%
7.8                 234    2.2%
8.8                 229    2.2%
6.8                 229    2.2%
10.3                223    2.1%
10.8                209    2.0%
9.3                 206    1.9%
9.8                 199    1.9%
6.3                 182    1.7%
Other values (870)  8436   79.3%

Minimum values

Value  Count  Frequency (%)
3.8    34     0.3%
4.3    33     0.3%
4.56   1      < 0.1%
4.75   1      < 0.1%
4.8    61     0.6%
5      2      < 0.1%
5.15   2      < 0.1%
5.16   2      < 0.1%
5.28   2      < 0.1%
5.3    99     0.9%

Maximum values

Value   Count  Frequency (%)
111.38  1      < 0.1%
92.84   1      < 0.1%
91.9    1      < 0.1%
89.44   1      < 0.1%
89.16   1      < 0.1%
88.56   1      < 0.1%
86.76   1      < 0.1%
85.06   1      < 0.1%
83.56   1      < 0.1%
80.9    1      < 0.1%

duration_minutes
Real number (ℝ)

High correlation  Skewed 

Distinct: 2190
Distinct (%): 20.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.324756
Minimum: 0.033333333
Maximum: 1439.55
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.2 KiB

Quantile statistics

Minimum: 0.033333333
5-th percentile: 2.9833333
Q1: 6.55
median: 10.866667
Q3: 17.316667
95-th percentile: 31.901667
Maximum: 1439.55
Range: 1439.5167
Interquartile range (IQR): 10.766667

Descriptive statistics

Standard deviation: 66.211797
Coefficient of variation (CV): 4.0559133
Kurtosis: 436.62145
Mean: 16.324756
Median Absolute Deviation (MAD): 4.9666667
Skewness: 20.69863
Sum: 173679.08
Variance: 4384.002
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
5.383333333          20     0.2%
7.566666667          18     0.2%
3.783333333          18     0.2%
7.05                 18     0.2%
9.383333333          18     0.2%
8.566666667          18     0.2%
6.1                  18     0.2%
13.36666667          17     0.2%
5.316666667          17     0.2%
10.15                17     0.2%
Other values (2180)  10460  98.3%

Minimum values

Value          Count  Frequency (%)
0.03333333333  4      < 0.1%
0.05           7      0.1%
0.06666666667  1      < 0.1%
0.08333333333  4      < 0.1%
0.1            5      < 0.1%
0.1333333333   4      < 0.1%
0.15           3      < 0.1%
0.1666666667   2      < 0.1%
0.1833333333   2      < 0.1%
0.2            1      < 0.1%

Maximum values

Value        Count  Frequency (%)
1439.55      1      < 0.1%
1439.15      1      < 0.1%
1438.65      1      < 0.1%
1438.55      1      < 0.1%
1438.466667  1      < 0.1%
1438.266667  1      < 0.1%
1436.5       1      < 0.1%
1435.8       1      < 0.1%
1433.983333  1      < 0.1%
1432.916667  1      < 0.1%
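The rows in the Sample section suggest that duration_minutes is the drop-off time minus the pickup time, expressed in minutes. The report does not state this, so the sketch below is an assumption checked against one sample row (pickup 2017-04-15 23:32:20, drop-off 2017-04-15 23:49:03, reported duration 16.716667); `duration_minutes` here is an illustrative helper.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # timestamp layout used in the sample rows

def duration_minutes(pickup: str, dropoff: str) -> float:
    """Elapsed time between two timestamps, in minutes."""
    delta = datetime.strptime(dropoff, FMT) - datetime.strptime(pickup, FMT)
    return delta.total_seconds() / 60.0

d = duration_minutes("2017-04-15 23:32:20", "2017-04-15 23:49:03")
print(round(d, 6))  # 16.716667
```

Read this way, the largest listed values (1432.9 to 1439.55) sit just under 1440 minutes, that is, just under 24 hours.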

Interactions

[81 interaction plot images not included]

Correlations

Columns are numbered in the same order as the rows.

                          (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)    (10)    (11)    (12)
(1)  DOLocationID       1.000   0.109   0.030  -0.061   0.096  -0.069   0.009   0.036  -0.003  -0.002  -0.062  -0.071
(2)  PULocationID       0.109   1.000   0.038  -0.058   0.101  -0.070  -0.006   0.021  -0.014  -0.047  -0.062  -0.070
(3)  RatecodeID         0.030   0.038   1.000   0.000   0.000   0.678   0.000   0.000   0.000   0.061   0.453   0.525
(4)  duration_minutes  -0.061  -0.058   0.000   1.000   0.014   0.964   0.020   0.000   0.382   0.231   0.943   0.839
(5)  extra              0.096   0.101   0.000   0.014   1.000   0.041   0.026   0.007   0.014   0.000   0.051   0.116
(6)  fare_amount       -0.069  -0.070   0.678   0.964   0.041   1.000   0.021   0.040   0.401   0.267   0.978   0.937
(7)  passenger_count    0.009  -0.006   0.000   0.020   0.026   0.021   1.000   0.025  -0.025   0.011   0.013   0.028
(8)  payment_type       0.036   0.021   0.000   0.000   0.007   0.040   0.025   1.000   0.193   0.017   0.087   0.031
(9)  tip_amount        -0.003  -0.014   0.000   0.382   0.014   0.401  -0.025   0.193   1.000   0.176   0.547   0.387
(10) tolls_amount      -0.002  -0.047   0.061   0.231   0.000   0.267   0.011   0.017   0.176   1.000   0.280   0.271
(11) total_amount      -0.062  -0.062   0.453   0.943   0.051   0.978   0.013   0.087   0.547   0.280   1.000   0.915
(12) trip_distance     -0.071  -0.070   0.525   0.839   0.116   0.937   0.028   0.031   0.387   0.271   0.915   1.000
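The High correlation alerts near the top of the report can be reproduced from this matrix by flagging every pair whose absolute coefficient is at least 0.5. The report does not state its cut-off, so the 0.5 threshold below is an assumption that happens to match the alerts; `high_correlations` is a hypothetical helper, and only the pairs among the six flagged variables are copied in.

```python
# Off-diagonal coefficients copied from the correlation matrix above
# (upper triangle, restricted to the six variables named in the alerts).
corr = {
    ("RatecodeID", "duration_minutes"): 0.000,
    ("RatecodeID", "fare_amount"): 0.678,
    ("RatecodeID", "tip_amount"): 0.000,
    ("RatecodeID", "total_amount"): 0.453,
    ("RatecodeID", "trip_distance"): 0.525,
    ("duration_minutes", "fare_amount"): 0.964,
    ("duration_minutes", "tip_amount"): 0.382,
    ("duration_minutes", "total_amount"): 0.943,
    ("duration_minutes", "trip_distance"): 0.839,
    ("fare_amount", "tip_amount"): 0.401,
    ("fare_amount", "total_amount"): 0.978,
    ("fare_amount", "trip_distance"): 0.937,
    ("tip_amount", "total_amount"): 0.547,
    ("tip_amount", "trip_distance"): 0.387,
    ("total_amount", "trip_distance"): 0.915,
}

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

def high_correlations(pairs, threshold):
    """Map each variable to the sorted list of partners with |r| >= threshold."""
    partners = {}
    for (a, b), r in pairs.items():
        if abs(r) >= threshold:
            partners.setdefault(a, []).append(b)
            partners.setdefault(b, []).append(a)
    return {name: sorted(others) for name, others in partners.items()}

flagged = high_correlations(corr, THRESHOLD)
for name, others in sorted(flagged.items()):
    print(f"{name}: {', '.join(others)}")
```

With this threshold, tip_amount is paired only with total_amount, while fare_amount, total_amount, and trip_distance each have four partners, in line with the alert counts.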

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

tpep_pickup_datetime  tpep_dropoff_datetime  passenger_count  trip_distance  RatecodeID  PULocationID  DOLocationID  payment_type  fare_amount  extra  mta_tax  tip_amount  tolls_amount  improvement_surcharge  total_amount  duration_minutes
42017-04-15 23:32:202017-04-15 23:49:0314.3714112216.50.50.50.000.00.317.8016.716667
52017-03-25 20:34:112017-03-25 20:42:1162.30116123619.00.50.52.060.00.312.368.000000
62017-05-03 19:04:092017-05-03 20:03:47112.83179241147.51.00.59.860.00.359.1659.633333
72017-08-15 17:41:062017-08-15 18:03:0512.981237114116.01.00.51.780.00.319.5821.983333
122017-06-09 19:00:262017-06-09 19:20:1113.00113148115.01.00.53.350.00.320.1519.750000
132017-11-06 23:35:052017-11-06 23:42:5712.3912092519.50.50.52.160.00.312.967.866667
162017-08-15 19:48:082017-08-15 20:00:3713.60116341112.51.00.52.850.00.317.1512.483333
182017-04-10 18:12:582017-04-10 18:17:3920.63126326225.01.00.50.000.00.36.804.683333
192017-03-05 04:01:072017-03-05 04:14:1122.7717968111.50.50.53.200.00.316.0013.066667
202017-12-30 23:52:442017-12-30 23:58:5711.10116623826.50.50.50.000.00.37.806.216667

Last rows

tpep_pickup_datetime  tpep_dropoff_datetime  passenger_count  trip_distance  RatecodeID  PULocationID  DOLocationID  payment_type  fare_amount  extra  mta_tax  tip_amount  tolls_amount  improvement_surcharge  total_amount  duration_minutes
226812017-06-09 18:24:492017-06-09 18:36:1511.79123414419.51.00.51.000.000.312.3011.433333
226832017-08-03 17:30:042017-08-03 17:41:5211.17110717018.51.00.52.060.000.312.3611.800000
226842017-08-03 16:36:322017-08-03 16:46:2321.201685018.01.00.51.950.000.311.759.850000
226852017-07-05 22:42:462017-07-05 22:49:2911.0111447916.50.50.51.560.000.39.366.716667
226862017-02-08 18:13:262017-02-08 19:34:11510.64117070152.01.00.514.845.540.374.1880.750000
226882017-08-05 21:23:292017-08-05 21:26:1130.44123016324.00.50.50.000.000.35.302.700000
226912017-01-06 01:50:142017-01-06 01:56:4712.1211707918.00.50.50.000.000.39.306.550000
226922017-07-16 03:22:512017-07-16 03:40:5215.70124917119.00.50.54.050.000.324.3518.016667
226932017-08-10 22:20:042017-08-10 22:29:3110.89122917017.50.50.51.760.000.310.569.450000
226942017-02-24 17:37:232017-02-24 17:40:3930.6114818624.01.00.50.000.000.35.803.266667
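In the sample rows, total_amount appears to equal the sum of fare_amount, extra, mta_tax, tip_amount, tolls_amount, and improvement_surcharge. This is an observation on these rows rather than a rule stated by the report. A minimal sketch of the check, using `Decimal` to avoid floating-point noise in currency sums (`expected_total` is an illustrative helper):

```python
from decimal import Decimal

def expected_total(fare, extra, mta_tax, tip, tolls, surcharge):
    """Sum the six charge components, given as decimal strings."""
    parts = (fare, extra, mta_tax, tip, tolls, surcharge)
    return sum((Decimal(p) for p in parts), Decimal("0"))

# Components from the sample row with index 22686 (reported total_amount: 74.18)
total = expected_total("52.0", "1.0", "0.5", "14.84", "5.54", "0.3")
print(total)  # 74.18

# Components from the sample row with index 4 (reported total_amount: 17.80)
first = expected_total("16.5", "0.5", "0.5", "0.00", "0.0", "0.3")
print(first == Decimal("17.80"))  # True
```

A check like this, run over every row, would be one way to spot records whose components do not add up.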